Results 1 - 20 of 21
1.
Nat Commun ; 15(1): 3347, 2024 Apr 18.
Article in English | MEDLINE | ID: mdl-38637553

ABSTRACT

Neurons in the inferotemporal (IT) cortex respond selectively to complex visual features, implying their role in object perception. However, perception is subjective and cannot be read out from neural responses; thus, bridging the causal gap between neural activity and perception demands independent characterization of perception. Historically, though, the complexity of the perceptual alterations induced by artificial stimulation of IT cortex has rendered them impossible to quantify. To address this long-standing problem, we tasked male macaque monkeys to detect and report optical impulses delivered to their IT cortex. Combining machine learning with high-throughput behavioral optogenetics, we generated complex and highly specific images that were hard for the animal to distinguish from the state of being cortically stimulated. These images, which we name "perceptograms", reveal and depict the contents of the complex hallucinatory percepts induced by local neural perturbation in IT cortex. Furthermore, we found that the nature and magnitude of these hallucinations depend strongly on concurrent visual input, stimulation location, and intensity. Objective characterization of stimulation-induced perceptual events opens the door to developing a mechanistic theory of visual perception. Further, it enables us to build better visual prosthetic devices and gain a greater understanding of visual hallucinations in mental disorders.


Subjects
Temporal Lobe , Visual Perception , Animals , Male , Humans , Macaca mulatta/physiology , Visual Perception/physiology , Temporal Lobe/physiology , Cerebral Cortex/physiology , Neurons/physiology , Photic Stimulation
2.
IEEE Trans Pattern Anal Mach Intell ; 45(9): 11382-11389, 2023 09.
Article in English | MEDLINE | ID: mdl-37104111

ABSTRACT

The human ability to recognize when an object belongs or does not belong to a particular vision task outperforms all open set recognition algorithms. Human perception as measured by the methods and procedures of visual psychophysics from psychology provides an additional data stream for algorithms that need to manage novelty. For instance, measured reaction time from human subjects can offer insight as to whether a class sample is prone to be confused with a different class - known or novel. In this work, we designed and performed a large-scale behavioral experiment that collected over 200,000 human reaction time measurements associated with object recognition. The data collected indicated that reaction time varies meaningfully across objects at the sample level. We therefore designed a new psychophysical loss function that enforces consistency with human behavior in deep networks which exhibit variable reaction time for different images. As in biological vision, this approach allows us to achieve good open set recognition performance in regimes with limited labeled training data. Through experiments using data from ImageNet, significant improvement is observed when training Multi-Scale DenseNets with this new formulation: it significantly improved top-1 validation accuracy by 6.02%, top-1 test accuracy on known samples by 9.81%, and top-1 test accuracy on unknown samples by 33.18%. We compared our method to 10 open set recognition methods from the literature, all of which it outperformed on multiple metrics.
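The reaction-time idea above lends itself to a simple sketch: weight each training sample's loss by how quickly humans recognized it. This is an illustrative scheme only; the function name, the normalization, and the direction of the weighting are assumptions, not the paper's actual loss formulation.

```python
import numpy as np

def rt_weighted_loss(per_sample_loss, reaction_times_ms):
    """Weight each sample's loss by human reaction time (hypothetical
    scheme): slow responses mark hard, confusable samples, so their
    penalty is down-weighted relative to fast, unambiguous ones."""
    rt = np.asarray(reaction_times_ms, dtype=float)
    # Normalize RTs to [0, 1]; fast (easy) samples get weights near 1.
    w = 1.0 - (rt - rt.min()) / (rt.max() - rt.min() + 1e-9)
    return float(np.mean(w * per_sample_loss))

losses = np.array([0.2, 1.5, 0.9])
rts = np.array([350.0, 1200.0, 600.0])  # milliseconds
print(rt_weighted_loss(losses, rts))  # ~0.278
```

The slowest sample (1200 ms) contributes nothing here; a real formulation would likely use a softer weighting.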


Subjects
Algorithms , Visual Perception , Humans , Recognition, Psychology , Pattern Recognition, Visual
3.
Sci Rep ; 12(1): 18306, 2022 10 31.
Article in English | MEDLINE | ID: mdl-36316363

ABSTRACT

Many of the images found in scientific publications are retouched, reused, or composed to enhance the quality of the presentation. In most instances, these edits are benign and help the reader better understand the material in a paper. However, some edits are instances of scientific misconduct and undermine the integrity of the presented research. Determining the legitimacy of edits made to scientific images is an open problem that no current technology can solve satisfactorily in a fully automated fashion. It thus remains up to human experts to inspect images as part of the peer-review process. Nonetheless, image analysis technologies promise to help experts perform this essential yet arduous task. Therefore, we introduce SILA, a system that makes image analysis tools available to reviewers and editors in a principled way. Further, SILA is the first human-in-the-loop end-to-end system that starts by processing article PDF files, performs image manipulation detection on the automatically extracted figures, and ends with image provenance graphs expressing the relationships between the images in question, to explain potential problems. To assess its efficacy, we introduce a dataset of scientific papers from around the globe containing annotated image manipulations and inadvertent reuse, which can serve as a benchmark for the problem at hand. Qualitative and quantitative results of the system are described using this dataset.


Subjects
Image Processing, Computer-Assisted , Scientific Misconduct , Humans , Publications
4.
IEEE Trans Pattern Anal Mach Intell ; 44(10): 6594-6601, 2022 10.
Article in English | MEDLINE | ID: mdl-34170823

ABSTRACT

In this paper, we consider how to incorporate psychophysical measurements of human visual perception into the loss function of a deep neural network being trained for a recognition task, under the assumption that such information can reduce errors. As a case study to assess the viability of this approach, we look at the problem of handwritten document transcription. While good progress has been made towards automatically transcribing modern handwriting, significant challenges remain in transcribing historical documents. Here we describe a general enhancement strategy, underpinned by the new loss formulation, which can be applied to the training regime of any deep learning-based document transcription system. Through experimentation, reliable performance improvement is demonstrated for the standard IAM and RIMES datasets for three different network architectures. Further, we go on to show feasibility for our approach on a new dataset of digitized Latin manuscripts, originally produced by scribes in the Cloister of St. Gall in the 9th century.


Subjects
Algorithms , Neural Networks, Computer , Handwriting , Humans , Perception
5.
IEEE Trans Image Process ; 30: 6892-6905, 2021.
Article in English | MEDLINE | ID: mdl-34288871

ABSTRACT

Images from social media can reflect diverse viewpoints, heated arguments, and expressions of creativity, adding new complexity to retrieval tasks. Researchers working on Content-Based Image Retrieval (CBIR) have traditionally tuned their algorithms to match filtered results with user search intent. However, we are now bombarded with composite images of unknown origin, authenticity, and even meaning. With such uncertainty, users may not have an initial idea of what the search query results should look like. For instance, hidden people, spliced objects, and subtly altered scenes can be difficult for a user to detect initially in a meme image, but may contribute significantly to its composition. It is pertinent to design systems that retrieve images with these nuanced relationships in addition to providing more traditional results, such as duplicates and near-duplicates, and to do so efficiently at large scale. We propose a new approach for spatial verification that models object-level regions using image keypoints retrieved from an image index, which is then used to accurately weight small contributing objects within the results, without the need for costly object detection steps. We call this method the Objects in Scene to Objects in Scene (OS2OS) score, and it is optimized for fast matrix operations, which can run quickly on either CPUs or GPUs. It performs comparably to state-of-the-art methods on classic CBIR problems (Oxford 5K, Paris 6K, and Google-Landmarks), and outperforms them in emerging retrieval tasks such as image composite matching in the NIST MFC2018 dataset and meme-style imagery from Reddit.

6.
Sci Rep ; 11(1): 1002, 2021 01 13.
Article in English | MEDLINE | ID: mdl-33441714

ABSTRACT

The analysis of fish behavior in response to odor stimulation is a crucial component of the general study of cross-modal sensory integration in vertebrates. In zebrafish, the centrifugal pathway runs between the olfactory bulb and the neural retina, originating at the terminalis neuron in the olfactory bulb. Any changes in the ambient odor of a fish's environment warrant a change in visual sensitivity and can trigger mating-like behavior in males due to increased GnRH signaling in the terminalis neuron. Behavioral experiments to study this phenomenon are commonly conducted in a controlled environment where a video of the fish is recorded over time before and after the application of chemicals to the water. Given the subtleties of behavioral change, trained biologists are currently required to annotate such videos as part of a study. This process of manually analyzing the videos is time-consuming, requires multiple experts to avoid human error/bias and cannot be easily crowdsourced on the Internet. Machine learning algorithms from computer vision, on the other hand, have proven to be effective for video annotation tasks because they are fast, accurate, and, if designed properly, can be less biased than humans. In this work, we propose to automate the entire process of analyzing videos of behavior changes in zebrafish by using tools from computer vision, relying on minimal expert supervision. The overall objective of this work is to create a generalized tool to predict animal behaviors from videos using state-of-the-art deep learning models, with the dual goal of advancing understanding in biology and engineering a more robust and powerful artificial information processing system for biologists.


Subjects
Behavior, Animal/physiology , Olfactory Bulb/physiology , Zebrafish/physiology , Algorithms , Animals , Computers , Female , Male , Neurons/physiology , Odorants , Retina/physiology
7.
IEEE Trans Pattern Anal Mach Intell ; 43(12): 4272-4290, 2021 12.
Article in English | MEDLINE | ID: mdl-32750769

ABSTRACT

What is the current state-of-the-art for image restoration and enhancement applied to degraded images acquired under less than ideal circumstances? Can the application of such algorithms as a pre-processing step improve image interpretability for manual analysis or automatic visual recognition to classify scene content? While there have been important advances in the area of computational photography to restore or enhance the visual quality of an image, the capabilities of such techniques have not always translated in a useful way to visual recognition tasks. Consequently, there is a pressing need for the development of algorithms that are designed for the joint problem of improving visual appearance and recognition, which will be an enabling factor for the deployment of visual recognition tools in many real-world scenarios. To address this, we introduce the UG2 dataset as a large-scale benchmark composed of video imagery captured under challenging conditions, and two enhancement tasks designed to test algorithmic impact on visual quality and automatic object recognition. Furthermore, we propose a set of metrics to evaluate the joint improvement of such tasks as well as individual algorithmic advances, including a novel psychophysics-based evaluation regime for human assessment and a realistic set of quantitative measures for object recognition performance. We introduce six new algorithms for image restoration or enhancement, which were created as part of the IARPA sponsored UG2 Challenge workshop held at CVPR 2018. Under the proposed evaluation regime, we present an in-depth analysis of these algorithms and a host of deep learning-based and classic baseline approaches. From the observed results, it is evident that we are in the early days of building a bridge between computational photography and visual recognition, leaving many opportunities for innovation in this area.

8.
Article in English | MEDLINE | ID: mdl-32224457

ABSTRACT

Existing enhancement methods are empirically expected to help high-level computer vision tasks; however, this is not always the case in practice. We focus on object and face detection under poor visibility caused by bad weather (haze, rain) and low-light conditions. To provide a more thorough examination and fair comparison, we introduce three benchmark sets collected in real-world hazy, rainy, and low-light conditions, respectively, with annotated objects/faces. We launched the UG2+ Challenge Track 2 competition at IEEE CVPR 2019, aiming to evoke a comprehensive discussion and exploration of whether and how low-level vision techniques can benefit high-level automatic visual recognition in various scenarios. To the best of our knowledge, this is the first and currently largest effort of its kind. Baseline results from cascading existing enhancement and detection models are reported, indicating the highly challenging nature of our new data as well as the large room for further technical innovation. Thanks to large participation from the research community, we are able to analyze representative team solutions, striving to better identify the strengths and limitations of existing mindsets as well as future directions.

9.
IEEE Trans Image Process ; 29(1): 2150-2165, 2020.
Article in English | MEDLINE | ID: mdl-31613762

ABSTRACT

In this paper we address the problem of hallucinating high-resolution facial images from low-resolution inputs at high magnification factors. We approach this task with convolutional neural networks (CNNs) and propose a novel (deep) face hallucination model that incorporates identity priors into the learning procedure. The model consists of two main parts: i) a cascaded super-resolution network that upscales the low-resolution facial images, and ii) an ensemble of face recognition models that act as identity priors for the super-resolution network during training. Different from most competing super-resolution techniques that rely on a single model for upscaling (even with large magnification factors), our network uses a cascade of multiple SR models that progressively upscale the low-resolution images using steps of 2×. This characteristic allows us to apply supervision signals (target appearances) at different resolutions and incorporate identity constraints at multiple scales. The proposed C-SRIP model (Cascaded Super Resolution with Identity Priors) is able to upscale (tiny) low-resolution images captured in unconstrained conditions and produce visually convincing results for diverse low-resolution inputs. We rigorously evaluate the proposed model on the Labeled Faces in the Wild (LFW), Helen and CelebA datasets and report superior performance compared to the existing state-of-the-art.
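The progressive 2× cascade can be sketched as follows. A naive nearest-neighbour upscaler stands in for each learned super-resolution module (the real model uses trained CNNs with identity priors at each scale); `upscale2x` and `cascaded_sr` are hypothetical names.

```python
import numpy as np

def upscale2x(img):
    """Naive 2x nearest-neighbour upscaling, standing in for one learned
    super-resolution module of the cascade."""
    return np.kron(img, np.ones((2, 2)))

def cascaded_sr(lr_img, steps):
    """Apply a cascade of 2x modules; each intermediate output is a point
    where supervision signals and identity constraints can be attached."""
    outputs = []
    x = lr_img
    for _ in range(steps):
        x = upscale2x(x)
        outputs.append(x)
    return outputs

lr = np.arange(4.0).reshape(2, 2)
outs = cascaded_sr(lr, steps=3)
print([o.shape for o in outs])  # [(4, 4), (8, 8), (16, 16)]
```

Collecting every intermediate output, rather than only the final one, is what makes multi-scale supervision possible.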


Subjects
Deep Learning , Face/anatomy & histology , Face/diagnostic imaging , Image Processing, Computer-Assisted/methods , Algorithms , Databases, Factual , Humans , Neural Networks, Computer
10.
Article in English | MEDLINE | ID: mdl-30863298

ABSTRACT

We propose a computational model of vision that describes the integration of cross-modal sensory information between the olfactory and visual systems in zebrafish based on the principles of the statistical extreme value theory. The integration of olfacto-retinal information is mediated by the centrifugal pathway that originates from the olfactory bulb and terminates in the neural retina. Motivation for using extreme value theory stems from physiological evidence suggesting that extremes and not the mean of the cell responses direct cellular activity in the vertebrate brain. We argue that the visual system, as measured by retinal ganglion cell responses in spikes/sec, follows an extreme value process for sensory integration and the increase in visual sensitivity from the olfactory input can be better modeled using extreme value distributions. As zebrafish maintains high evolutionary proximity to mammals, our model can be extended to other vertebrates as well.
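The core claim, that maxima rather than means of cell responses direct downstream activity, can be illustrated numerically. The gamma-distributed firing rates below are an assumption for illustration only, not the paper's fitted model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical firing rates (spikes/sec) for a population of retinal
# ganglion cells across trials: 1000 trials, 50 cells each.
trials = rng.gamma(shape=2.0, scale=10.0, size=(1000, 50))

mean_drive = trials.mean(axis=1)  # Gaussian-like by the central limit theorem
max_drive = trials.max(axis=1)    # per-trial extremes, the EVT regime

# The extremes sit far above the mean response and are right-skewed,
# which is why an extreme value distribution fits them better than a
# Gaussian does.
print(max_drive.mean() > mean_drive.mean())  # True
```

Block maxima like `max_drive` are exactly the quantity that the Fisher-Tippett-Gnedenko theorem says converge to a generalized extreme value distribution.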

11.
IEEE Trans Pattern Anal Mach Intell ; 41(9): 2280-2286, 2019 09.
Article in English | MEDLINE | ID: mdl-29994469

ABSTRACT

By providing substantial amounts of data and standardized evaluation protocols, datasets in computer vision have helped fuel advances across all areas of visual recognition. But even in light of breakthrough results on recent benchmarks, it is still fair to ask if our recognition algorithms are doing as well as we think they are. The vision sciences at large make use of a very different evaluation regime known as Visual Psychophysics to study visual perception. Psychophysics is the quantitative examination of the relationships between controlled stimuli and the behavioral responses they elicit in experimental test subjects. Instead of using summary statistics to gauge performance, psychophysics directs us to construct item-response curves made up of individual stimulus responses to find perceptual thresholds, thus allowing one to identify the exact point at which a subject can no longer reliably recognize the stimulus class. In this article, we introduce a comprehensive evaluation framework for visual recognition models that is underpinned by this methodology. Over millions of procedurally rendered 3D scenes and 2D images, we compare the performance of well-known convolutional neural networks. Our results bring into question recent claims of human-like performance, and provide a path forward for correcting newly surfaced algorithmic deficiencies.
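The item-response-curve idea above can be sketched with a toy threshold finder: sweep a stimulus perturbation, record accuracy per level, and report the level where accuracy first drops below a criterion. The function, levels, and criterion below are all hypothetical.

```python
def item_response_threshold(levels, accuracies, criterion=0.75):
    """Given per-level recognition accuracies (an item-response curve),
    return the first stimulus level at which accuracy falls below the
    criterion - a simple stand-in for a psychophysical threshold."""
    for level, acc in zip(levels, accuracies):
        if acc < criterion:
            return level
    return None  # subject never fell below criterion

# Hypothetical curve: accuracy as, e.g., blur strength increases.
blur_levels = [0, 1, 2, 3, 4]
accuracy = [0.98, 0.95, 0.80, 0.60, 0.30]
print(item_response_threshold(blur_levels, accuracy))  # 3
```

Comparing such thresholds between models and human subjects, level by level, is far more diagnostic than a single summary accuracy.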

12.
Sci Rep ; 8(1): 17585, 2018 Nov 29.
Article in English | MEDLINE | ID: mdl-30498261

ABSTRACT

A correction to this article has been published and is linked from the HTML and PDF versions of this paper. The error has been fixed in the paper.

13.
Sci Rep ; 8(1): 14247, 2018 09 24.
Article in English | MEDLINE | ID: mdl-30250218

ABSTRACT

Imaging is a dominant strategy for data collection in neuroscience, yielding stacks of images that often scale to gigabytes of data for a single experiment. Machine learning algorithms from computer vision can serve as a pair of virtual eyes that tirelessly processes these images, automatically detecting and identifying microstructures. Unlike learning methods, our Flexible Learning-free Reconstruction of Imaged Neural volumes (FLoRIN) pipeline exploits structure-specific contextual clues and requires no training. This approach generalizes across different modalities, including serially-sectioned scanning electron microscopy (sSEM) of genetically labeled and contrast enhanced processes, spectral confocal reflectance (SCoRe) microscopy, and high-energy synchrotron X-ray microtomography (µCT) of large tissue volumes. We deploy the FLoRIN pipeline on newly published and novel mouse datasets, demonstrating the high biological fidelity of the pipeline's reconstructions. FLoRIN reconstructions are of sufficient quality for preliminary biological study, for example examining the distribution and morphology of cells or extracting single axons from functional data. Compared to existing supervised learning methods, FLoRIN is one to two orders of magnitude faster and produces high-quality reconstructions that are tolerant to noise and artifacts, as is shown qualitatively and quantitatively.


Subjects
Image Processing, Computer-Assisted/methods , Imaging, Three-Dimensional/methods , Machine Learning , Algorithms , Animals , Mice , Synchrotrons/instrumentation , X-Ray Microtomography/methods
14.
Article in English | MEDLINE | ID: mdl-30130187

ABSTRACT

Prior art has shown it is possible to estimate, through image processing and computer vision techniques, the types and parameters of transformations that have been applied to the content of individual images to obtain new images. Given a large corpus of images and a query image, an interesting further step is to retrieve the set of original images whose content is present in the query image, as well as the detailed sequences of transformations that yield the query image given the original images. This is a problem that recently has received the name of image provenance analysis. In these times of public media manipulation (e.g., fake news and meme sharing), obtaining the history of image transformations is relevant for fact checking and authorship verification, among many other applications. This article presents an end-to-end processing pipeline for image provenance analysis, which works at real-world scale. It employs a cutting-edge image filtering solution that is custom-tailored for the problem at hand, as well as novel techniques for obtaining the provenance graph that expresses how the images, as nodes, are ancestrally connected. A comprehensive set of experiments for each stage of the pipeline is provided, comparing the proposed solution with state-of-the-art results, employing previously published datasets. In addition, this work introduces a new dataset of real-world provenance cases from the social media site Reddit, along with baseline results.

15.
Sci Rep ; 8(1): 5397, 2018 03 29.
Article in English | MEDLINE | ID: mdl-29599461

ABSTRACT

Machine learning is a field of computer science that builds algorithms that learn. In many cases, machine learning algorithms are used to recreate a human ability like adding a caption to a photo, driving a car, or playing a game. While the human brain has long served as a source of inspiration for machine learning, little effort has been made to directly use data collected from working brains as a guide for machine learning algorithms. Here we demonstrate a new paradigm of "neurally-weighted" machine learning, which takes fMRI measurements of human brain activity from subjects viewing images, and infuses these data into the training process of an object recognition learning algorithm to make it more consistent with the human brain. After training, these neurally-weighted classifiers are able to classify images without requiring any additional neural data. We show that our neural-weighting approach can lead to large performance gains when used with traditional machine vision features, as well as to significant improvements with already high-performing convolutional neural network features. The effectiveness of this approach points to a path forward for a new class of hybrid machine learning algorithms which take both inspiration and direct constraints from neuronal data.
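One way to picture the neural-weighting idea: scale each visual feature dimension by a weight derived from fMRI activity recorded while subjects viewed images, then train an ordinary classifier on the reweighted features. The function and weighting scheme below are a hypothetical sketch, not the paper's actual method.

```python
import numpy as np

def neurally_weighted_features(features, neural_weights):
    """Scale each feature dimension by a (hypothetical) weight measuring
    how strongly it tracks brain activity. After training on reweighted
    features, the classifier needs no further neural data."""
    w = np.asarray(neural_weights, dtype=float)
    return features * (w / w.sum())

X = np.array([[1.0, 2.0, 3.0]])   # one sample, three feature dimensions
w = np.array([3.0, 1.0, 0.0])     # third dimension carries no neural signal
print(neurally_weighted_features(X, w).tolist())  # [[0.75, 0.5, 0.0]]
```

Dimensions the brain ignores are suppressed before learning ever starts, which is one plausible route to the reported gains.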


Subjects
Brain/physiology , Machine Learning , Brain/diagnostic imaging , Humans , Image Processing, Computer-Assisted , Magnetic Resonance Imaging
16.
IEEE Trans Pattern Anal Mach Intell ; 40(3): 762-768, 2018 03.
Article in English | MEDLINE | ID: mdl-28541894

ABSTRACT

It is often desirable to be able to recognize when inputs to a recognition function learned in a supervised manner correspond to classes unseen at training time. With this ability, new class labels could be assigned to these inputs by a human operator, allowing them to be incorporated into the recognition function, ideally under an efficient incremental update mechanism. While good algorithms that assume inputs from a fixed set of classes exist, e.g., artificial neural networks and kernel machines, it is not immediately obvious how to extend them to perform incremental learning in the presence of unknown query classes. Existing algorithms take little to no distributional information into account when learning recognition functions and lack a strong theoretical foundation. We address this gap by formulating a novel, theoretically sound classifier, the Extreme Value Machine (EVM). The EVM has a well-grounded interpretation derived from statistical Extreme Value Theory (EVT), and is the first classifier able to perform nonlinear, kernel-free, variable-bandwidth incremental learning. Compared to other classifiers in the same deep network derived feature space, the EVM is accurate and efficient on an established benchmark partition of the ImageNet dataset.
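The EVM's inclusion probability can be sketched from its EVT grounding: a Weibull fit to an exemplar's margin distances yields a probability of class membership that decays with distance from that exemplar. The function below is a minimal sketch assuming the standard Weibull survival form; the parameter names are ours, and the fitting step is omitted.

```python
import numpy as np

def evm_inclusion_probability(dist, shape_k, scale_lam):
    """Psi-style inclusion probability: the chance that a query point at
    distance `dist` from an exemplar shares its class, given a Weibull
    fit (shape_k, scale_lam) to that exemplar's margin distances."""
    return np.exp(-(np.asarray(dist, dtype=float) / scale_lam) ** shape_k)

# Probability is 1 at the exemplar and abates smoothly into open space.
print(evm_inclusion_probability(0.0, 2.0, 1.0))        # 1.0
print(evm_inclusion_probability(3.0, 2.0, 1.0) < 0.01) # True
```

Because each exemplar carries its own (shape_k, scale_lam) pair, adding a new class is a local update, which is what makes incremental learning cheap.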

17.
Elife ; 5, 2016 07 07.
Article in English | MEDLINE | ID: mdl-27383271

ABSTRACT

Resolving patterns of synaptic connectivity in neural circuits currently requires serial section electron microscopy. However, complete circuit reconstruction is prohibitively slow and may not be necessary for many purposes such as comparing neuronal structure and connectivity among multiple animals. Here, we present an alternative strategy, targeted reconstruction of specific neuronal types. We used viral vectors to deliver peroxidase derivatives, which catalyze production of an electron-dense tracer, to genetically identify neurons, and developed a protocol that enhances the electron-density of the labeled cells while retaining the quality of the ultrastructure. The high contrast of the marked neurons enabled two innovations that speed data acquisition: targeted high-resolution reimaging of regions selected from rapidly-acquired lower resolution reconstruction, and an unsupervised segmentation algorithm. This pipeline reduces imaging and reconstruction times by two orders of magnitude, facilitating directed inquiry of circuit motifs.


Subjects
Image Processing, Computer-Assisted/methods , Microscopy, Electron/methods , Microtomy/methods , Nerve Net/anatomy & histology , Neural Pathways/anatomy & histology , Retina/cytology , Staining and Labeling/methods , Animals , Female , Male , Mice, Inbred C57BL
18.
IEEE Trans Pattern Anal Mach Intell ; 36(11): 2317-24, 2014 Nov.
Article in English | MEDLINE | ID: mdl-26353070

ABSTRACT

Real-world tasks in computer vision often touch upon open set recognition: multi-class recognition with incomplete knowledge of the world and many unknown inputs. Recent work on this problem has proposed a model incorporating an open space risk term to account for the space beyond the reasonable support of known classes. This paper extends the general idea of open space risk limiting classification to accommodate non-linear classifiers in a multiclass setting. We introduce a new open set recognition model called compact abating probability (CAP), where the probability of class membership decreases in value (abates) as points move from known data toward open space. We show that CAP models improve open set recognition for multiple algorithms. Leveraging the CAP formulation, we go on to describe the novel Weibull-calibrated SVM (W-SVM) algorithm, which combines the useful properties of statistical extreme value theory for score calibration with one-class and binary support vector machines. Our experiments show that the W-SVM is significantly better for open set object detection and OCR problems when compared to the state-of-the-art for the same tasks.
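The compact abating probability idea can be sketched as a Weibull-calibrated score that additionally decays with distance from known data. Everything below is an illustrative simplification with made-up parameter values, not the paper's W-SVM formulation.

```python
import numpy as np

def weibull_cdf(x, shape_k, scale_lam):
    """Weibull CDF, used to turn raw scores into calibrated values in [0, 1]."""
    x = np.maximum(np.asarray(x, dtype=float), 0.0)
    return 1.0 - np.exp(-(x / scale_lam) ** shape_k)

def cap_score(raw_score, dist_to_known, shape_k=1.5, scale_lam=1.0, decay=1.0):
    """Compact abating probability sketch: a calibrated score whose value
    abates exponentially as the sample moves away from known data."""
    return weibull_cdf(raw_score, shape_k, scale_lam) * np.exp(-decay * dist_to_known)

# Identical raw classifier scores, but the sample deep in open space is
# abated toward zero instead of being confidently accepted.
print(cap_score(2.0, dist_to_known=0.1) > cap_score(2.0, dist_to_known=5.0))  # True
```

The abating factor is what bounds open space risk: far from all known data, no score can remain high.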

19.
IEEE Trans Pattern Anal Mach Intell ; 36(8): 1679-86, 2014 Aug.
Article in English | MEDLINE | ID: mdl-26353347

ABSTRACT

For many problems in computer vision, human learners are considerably better than machines. Humans possess highly accurate internal recognition and learning mechanisms that are not yet understood, and they frequently have access to more extensive training data through a lifetime of unbiased experience with the visual world. We propose to use visual psychophysics to directly leverage the abilities of human subjects to build better machine learning systems. First, we use an advanced online psychometric testing platform to make new kinds of annotation data available for learning. Second, we develop a technique for harnessing these new kinds of information, "perceptual annotations", for support vector machines. A key intuition for this approach is that while it may remain infeasible to dramatically increase the amount of data and high-quality labels available for the training of a given system, measuring the exemplar-by-exemplar difficulty and pattern of errors of human annotators can provide important information for regularizing the solution of the system at hand. A case study for the problem of face detection demonstrates that this approach yields state-of-the-art results on the challenging FDDB data set.


Subjects
Machine Learning , Pattern Recognition, Automated/methods , Pattern Recognition, Visual/physiology , Data Curation , Databases, Factual , Face/anatomy & histology , Female , Humans , Male , Psychophysics , Support Vector Machine
20.
IEEE Trans Pattern Anal Mach Intell ; 35(7): 1757-72, 2013 Jul.
Article in English | MEDLINE | ID: mdl-23682001

ABSTRACT

To date, almost all experimental evaluations of machine learning-based recognition algorithms in computer vision have taken the form of "closed set" recognition, whereby all testing classes are known at training time. A more realistic scenario for vision applications is "open set" recognition, where incomplete knowledge of the world is present at training time, and unknown classes can be submitted to an algorithm during testing. This paper explores the nature of open set recognition and formalizes its definition as a constrained minimization problem. The open set recognition problem is not well addressed by existing algorithms because it requires strong generalization. As a step toward a solution, we introduce a novel "1-vs-set machine," which sculpts a decision space from the marginal distances of a 1-class or binary SVM with a linear kernel. This methodology applies to several different applications in computer vision where open set recognition is a challenging problem, including object recognition and face verification. We consider both in this work, with large scale cross-dataset experiments performed over the Caltech 256 and ImageNet sets, as well as face matching experiments performed over the Labeled Faces in the Wild set. The experiments highlight the effectiveness of machines adapted for open set evaluation compared to existing 1-class and binary SVMs for the same tasks.
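The "1-vs-set machine" idea of sculpting a decision space from marginal distances can be sketched as a slab rule: accept only scores between two parallel planes, instead of everything above a single SVM threshold. The rule below is an illustrative simplification, not the paper's constrained minimization.

```python
import numpy as np

def one_vs_set_predict(scores, lower, upper):
    """Slab decision rule in the spirit of the 1-vs-set machine: starting
    from a linear SVM's decision scores, accept a sample as the known
    class only if it lies between two parallel planes, rejecting both far
    negatives and far positives as unknown (open space)."""
    s = np.asarray(scores, dtype=float)
    return (s >= lower) & (s <= upper)

svm_scores = np.array([-2.0, 0.3, 1.1, 5.0])
# A plain SVM would accept every score >= 0, including the wildly
# positive outlier; the slab rejects it as open space risk.
print(one_vs_set_predict(svm_scores, lower=0.0, upper=2.0).tolist())
# [False, True, True, False]
```

Choosing the two plane offsets is where the real method spends its effort, trading empirical risk against open space risk.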


Subjects
Image Processing, Computer-Assisted/methods , Pattern Recognition, Automated/methods , Support Vector Machine , Animals , Biometric Identification , Humans